Incorporating external knowledge into the response generation process is essential to building more helpful and reliable dialog agents. However, collecting knowledge-grounded conversations is often costly, calling for a better pre-trained model for grounded dialog generation that generalizes well across different types of knowledge. In this work, we propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation that does not rely on extra knowledge annotation. Specifically, we use a pre-trained language model to extract the most uncertain tokens in a dialog as keywords. With these keywords, we construct two kinds of knowledge and pre-train a knowledge-grounded response generation model, aiming to handle two different scenarios: (1) the knowledge should be faithfully grounded; (2) it can be selectively used. For the former, the grounding knowledge consists of keywords extracted from the response. For the latter, the grounding knowledge is additionally augmented with keywords extracted from other utterances in the same dialog. Since the knowledge is extracted from the dialog itself, KPT can easily be applied to a large volume and variety of dialog data. We consider three data sources (open-domain, task-oriented, conversational QA) totaling 2.5M dialogs. We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages. Our comprehensive experiments and analyses demonstrate that KPT consistently outperforms state-of-the-art methods on these tasks with diverse grounding knowledge.
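The keyword-extraction step lends itself to a small illustration. Below is a minimal sketch, assuming a BERT-style masked language model from the `transformers` library; the model name, the one-token-at-a-time masking, and the top-k cutoff are illustrative choices, not necessarily the paper's exact procedure.

```python
# Sketch: score each token of an utterance by how "surprising" it is to a masked
# language model, and keep the most uncertain tokens as keywords (illustrative only).
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def extract_keywords(utterance: str, top_k: int = 5):
    enc = tokenizer(utterance, return_tensors="pt")
    ids = enc["input_ids"][0]
    scores = []
    for pos in range(1, ids.size(0) - 1):          # skip [CLS] / [SEP]
        masked = ids.clone()
        masked[pos] = tokenizer.mask_token_id      # mask one token at a time
        with torch.no_grad():
            logits = model(input_ids=masked.unsqueeze(0)).logits[0, pos]
        log_probs = torch.log_softmax(logits, dim=-1)
        # low probability of the original token = high uncertainty
        scores.append((-log_probs[ids[pos]].item(), pos))
    scores.sort(reverse=True)
    return [tokenizer.decode([int(ids[pos])]) for _, pos in scores[:top_k]]

print(extract_keywords("I adopted a golden retriever puppy last weekend"))
```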
Background and Purpose: Colorectal cancer is a common fatal malignancy, the fourth most common cancer in men and the third most common cancer in women worldwide. Timely detection of the cancer in its early stages is essential for treating the disease. Currently, there is a lack of datasets for histopathological image segmentation of rectal cancer, which often hampers assessment accuracy when computer technology is used to aid in diagnosis. Methods: This study provides a new publicly available Enteroscope Biopsy Histopathological Hematoxylin and Eosin Image Dataset for Image Segmentation Tasks (EBHI-Seg). To demonstrate the validity and breadth of EBHI-Seg, it is evaluated with both classical machine learning methods and deep learning methods. Results: The experiments showed that deep learning methods achieve better image segmentation performance on EBHI-Seg. The best Dice score achieved by the classical machine learning methods is 0.948, while the best Dice score achieved by the deep learning methods is 0.965. Conclusion: This publicly available dataset contains 5,170 images covering six tumor differentiation stages, together with the corresponding ground-truth images. The dataset can support researchers in developing new segmentation algorithms for the medical diagnosis of colorectal cancer, which can be used in clinical settings to help doctors and patients.
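Since the benchmark results are reported with the Dice metric, a short worked example of how Dice is typically computed for a binary segmentation mask may be useful; the smoothing term and the toy masks below are conventional illustrative choices, not taken from the paper.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2|P ∩ G| / (|P| + |G|) for binary masks P (prediction) and G (ground truth)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy example: a 4x4 predicted mask vs. its ground truth.
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
gt   = np.array([[0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
print(dice_coefficient(pred, gt))   # 2*3 / (4 + 3) ≈ 0.857
```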
Stock and flow diagrams are already an important tool in epidemiology, but category theory lets us go further and treat these diagrams as mathematical entities in their own right. In this chapter we use communicable disease models created with our software, StockFlow.jl, to explain the benefits of the categorical approach. We first explain the category of stock-flow diagrams, and note the clear separation between the syntax of these diagrams and their semantics, demonstrating three examples of semantics already implemented in the software: ODEs, causal loop diagrams, and system structure diagrams. We then turn to two methods for building large stock-flow diagrams from smaller ones in a modular fashion: composition and stratification. Finally, we introduce the open-source ModelCollab software for diagram-based collaborative modeling. The graphical user interface of this web-based software lets modelers take advantage of the ideas discussed here without any knowledge of their categorical foundations.
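StockFlow.jl itself is Julia software, but the idea of assigning ODE semantics to a stock-flow diagram can be sketched in a few lines of Python. The SIR example, the dictionary-based data structure, and the Euler integrator below are illustrative assumptions and do not reflect the actual StockFlow.jl API.

```python
# A stock-flow diagram as plain data: stocks, plus flows with source/target stocks
# and a rate function of the current state. ODE semantics: d(stock)/dt = inflows - outflows.
def simulate(stocks, flows, dt=0.1, steps=1000):
    state = dict(stocks)
    for _ in range(steps):
        rates = {name: f["rate"](state) for name, f in flows.items()}
        for name, f in flows.items():
            if f["from"] is not None:
                state[f["from"]] -= dt * rates[name]
            if f["to"] is not None:
                state[f["to"]] += dt * rates[name]
    return state

# SIR model: an infection flow moves people S -> I, a recovery flow moves I -> R.
beta, gamma, N = 0.3, 0.1, 1000.0
stocks = {"S": 990.0, "I": 10.0, "R": 0.0}
flows = {
    "infection": {"from": "S", "to": "I", "rate": lambda s: beta * s["S"] * s["I"] / N},
    "recovery":  {"from": "I", "to": "R", "rate": lambda s: gamma * s["I"]},
}
print(simulate(stocks, flows))
```

Separating the diagram (the `stocks`/`flows` data) from the integrator mirrors, in a very rough way, the syntax/semantics split the chapter describes.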
Human motion transfer refers to synthesizing photo-realistic and temporally coherent videos that enable one person to imitate the motion of another. However, current synthetic videos suffer from temporal inconsistency across sequential frames, which significantly degrades video quality yet is far from being solved by existing methods in the pixel domain. Recently, motivated by the frequency-domain deficiencies of image synthesis methods, some work on DeepFake detection has attempted to distinguish natural images from synthetic images in the frequency domain. Nevertheless, the temporal inconsistency of synthetic videos has rarely been studied from the perspective of the frequency-domain gap between natural and synthetic videos. In this paper, we propose to delve into the frequency space for temporally consistent human motion transfer. We first conduct a comprehensive analysis of natural and synthetic videos in the frequency domain, revealing frequency gaps in both the spatial dimension of individual frames and the temporal dimension of the video. To close the frequency gap between natural and synthetic videos, we propose a novel frequency-based human motion transfer framework, named FreMOTR, which can effectively mitigate both the spatial artifacts and the temporal inconsistency of synthesized videos. FreMOTR explores two novel frequency-based regularization modules: 1) Frequency-domain Appearance Regularization (FAR), which improves the appearance of the person in individual frames, and 2) Temporal Frequency Regularization (TFR), which ensures temporal consistency between adjacent frames. Finally, comprehensive experiments demonstrate that FreMOTR not only delivers superior performance on temporal consistency metrics but also improves the frame-level visual quality of synthesized videos. In particular, its temporal consistency metrics improve by nearly 30% over the state-of-the-art model.
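One way to read the two regularizers is as frequency-space losses on the synthesized frames. The sketch below is only an illustrative interpretation, not the paper's formulation: a frame-level loss on the 2-D FFT amplitude spectrum (roughly in the spirit of FAR) and a loss on the frequency-domain change between adjacent frames of the generated versus reference videos (roughly in the spirit of TFR). The loss weights and spectrum choice are assumptions.

```python
import torch

def amplitude_spectrum(frame: torch.Tensor) -> torch.Tensor:
    """frame: (C, H, W) -> per-channel 2-D FFT amplitude spectrum."""
    return torch.abs(torch.fft.fft2(frame))

def far_loss(generated: torch.Tensor, reference: torch.Tensor) -> torch.Tensor:
    """Frame-level appearance term: match amplitude spectra of single frames."""
    return torch.mean(torch.abs(amplitude_spectrum(generated) - amplitude_spectrum(reference)))

def tfr_loss(gen_t, gen_t1, ref_t, ref_t1) -> torch.Tensor:
    """Temporal term: the frequency-domain change between adjacent generated frames
    should match the change between the corresponding reference frames."""
    gen_delta = amplitude_spectrum(gen_t1) - amplitude_spectrum(gen_t)
    ref_delta = amplitude_spectrum(ref_t1) - amplitude_spectrum(ref_t)
    return torch.mean(torch.abs(gen_delta - ref_delta))

# Toy usage with random frames of shape (3, 64, 64).
g0, g1, r0, r1 = (torch.rand(3, 64, 64) for _ in range(4))
total = far_loss(g1, r1) + 0.5 * tfr_loss(g0, g1, r0, r1)   # 0.5 is an arbitrary weight
print(total.item())
```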
In recent years, colorectal cancer has become one of the most important diseases endangering human health. Deep learning methods are increasingly important for the classification of colorectal histopathology images. However, existing methods focus more on end-to-end automatic classification by the computer than on human-computer interaction. In this paper, we propose the IL-MCAM framework, based on attention mechanisms and interactive learning. The proposed IL-MCAM framework consists of two stages: automatic learning (AL) and interactive learning (IL). In the AL stage, a multi-channel attention mechanism model containing three different attention-mechanism channels and convolutional neural networks is used to extract multi-channel features for classification. In the IL stage, the proposed IL-MCAM framework continuously adds misclassified images into the training set in an interactive manner, which improves the classification ability of the MCAM model. We conducted comparison experiments on our dataset and extended experiments on the HE-NCT-CRC-100K dataset to verify the performance of the proposed IL-MCAM framework, achieving classification accuracies of 98.98% and 99.77%, respectively. In addition, we conducted ablation and interchangeability experiments to verify the capability and interchangeability of the three channels. The experimental results show that the proposed IL-MCAM framework performs excellently on colorectal histopathology image classification tasks.
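The interactive-learning stage amounts to a loop that repeatedly folds misclassified samples back into training. The sketch below illustrates that loop with a generic scikit-learn classifier on synthetic features; the real framework uses the multi-channel attention CNN and a human reviewer in the loop, which this sketch does not reproduce.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 32))                   # stand-in for extracted image features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)     # stand-in for normal/abnormal labels
X_train, X_pool, y_train, y_pool = train_test_split(X, y, test_size=0.5, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
for round_idx in range(3):                        # a few interactive rounds
    model.fit(X_train, y_train)
    pred = model.predict(X_pool)
    wrong = pred != y_pool                        # in practice, flagged interactively
    # Fold the misclassified images back into the training set and drop them from the pool.
    X_train = np.concatenate([X_train, X_pool[wrong]])
    y_train = np.concatenate([y_train, y_pool[wrong]])
    X_pool, y_pool = X_pool[~wrong], y_pool[~wrong]
    print(f"round {round_idx}: added {wrong.sum()} misclassified samples")
```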
Existing reference-free metrics have obvious limitations for evaluating controlled text generation models. Unsupervised metrics can only provide a task-agnostic evaluation result which correlates weakly with human judgments, whereas supervised ones may overfit task-specific data with poor generalization ability to other datasets. In this paper, we propose an unsupervised reference-free metric called CTRLEval, which evaluates controlled text generation from different aspects by formulating each aspect into multiple text infilling tasks. On top of these tasks, the metric assembles the generation probabilities from a pre-trained language model without any model training. Experimental results show that our metric has higher correlations with human judgments than other baselines, while obtaining better generalization of evaluating generated texts from different models and with different qualities.
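The core operation, turning an evaluation aspect into text-infilling tasks and reading off the probabilities of a pre-trained model, can be sketched as follows. The choice of T5, the single masked span, and the length-normalized log-probability are assumptions for illustration; CTRLEval's actual task construction and score aggregation differ in detail.

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")
model.eval()

def infill_logprob(prefix: str, span: str, suffix: str) -> float:
    """Length-normalized log-probability of `span` filling the masked slot."""
    source = f"{prefix} <extra_id_0> {suffix}"
    target = f"<extra_id_0> {span} <extra_id_1>"
    enc = tokenizer(source, return_tensors="pt")
    labels = tokenizer(target, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(input_ids=enc.input_ids, labels=labels).loss   # mean NLL per target token
    return -loss.item()

# Higher score = the model finds the span more plausible in this context.
print(infill_logprob("The movie was absolutely", "wonderful", "and I would watch it again."))
print(infill_logprob("The movie was absolutely", "refrigerator", "and I would watch it again."))
```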
In the proposed SEHybridSN model, dense blocks are used to reuse shallow features, aiming to better exploit hierarchical spatial-spectral features. Subsequent depthwise separable convolution layers are used to discriminate spatial information. The spatial-spectral features are further refined by a channel attention method, which is applied after every 3D convolution layer and every 2D convolution layer. Experimental results indicate that our proposed model learns more discriminative spatial-spectral features from very little training data. SEHybridSN achieves very satisfactory performance using only 0.05 and 0.01 of the labeled training data.
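The channel-attention step applied after each convolution layer is, in spirit, a squeeze-and-excitation block. A minimal PyTorch sketch of such a block for 2-D feature maps is given below; the reduction ratio and exact placement are assumptions, and the 3-D case is analogous.

```python
import torch
import torch.nn as nn

class ChannelAttention2d(nn.Module):
    """Squeeze-and-excitation style channel attention for (B, C, H, W) feature maps."""
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        squeeze = x.mean(dim=(2, 3))              # global average pooling over H, W
        weights = self.fc(squeeze).view(b, c, 1, 1)
        return x * weights                        # re-weight channels

# Example: refine the output of a 2-D convolution layer.
features = torch.rand(4, 64, 16, 16)
refined = ChannelAttention2d(64)(features)
print(refined.shape)   # torch.Size([4, 64, 16, 16])
```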
Generalized zero-shot learning (GZSL) aims to recognize novel categories with auxiliary semantic information, e.g., category attributes. In this paper, we handle the critical issue of domain shift, i.e., the confusion between seen and unseen categories, by progressively improving the cross-domain transferability and category discriminability of visual representations. Our method, named Dual Progressive Prototype Network (DPPN), constructs two types of prototypes that record prototypical visual patterns for attributes and categories, respectively. With attribute prototypes, DPPN alternately searches attribute-related local regions and updates the corresponding attribute prototypes to progressively explore accurate attribute-region correspondence. This enables DPPN to produce visual representations with accurate attribute localization ability, which benefits semantic-visual alignment and representation transferability. Besides, along with progressive attribute localization, DPPN further projects category prototypes into multiple spaces to progressively repel visual representations from different categories, which boosts category discriminability. Both attribute and category prototypes are collaboratively learned in a unified framework, which makes the visual representations of DPPN transferable and distinctive. Experiments on four benchmarks demonstrate that DPPN effectively alleviates the domain shift problem in GZSL.
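The attribute-prototype step can be pictured as attention pooling: each attribute prototype attends over local visual features to find its related region, and the pooled feature becomes the visual evidence for that attribute. The sketch below is a loose illustration of that idea in PyTorch, not DPPN's actual architecture; the dimensions and the dot-product attention are assumptions.

```python
import torch
import torch.nn.functional as F

def attribute_attention_pool(local_feats: torch.Tensor, attr_protos: torch.Tensor) -> torch.Tensor:
    """
    local_feats: (B, R, D) local region features; attr_protos: (A, D) attribute prototypes.
    Returns (B, A, D): for each attribute, a feature pooled from its most related regions.
    """
    # Similarity of every region to every attribute prototype -> attention over regions.
    sims = torch.einsum("brd,ad->bar", local_feats, attr_protos)   # (B, A, R)
    attn = F.softmax(sims, dim=-1)
    return torch.einsum("bar,brd->bad", attn, local_feats)

# Toy usage: 2 images, 49 local regions (7x7 grid), 512-d features, 85 attributes.
local_feats = torch.rand(2, 49, 512)
attr_protos = torch.rand(85, 512)
print(attribute_attention_pool(local_feats, attr_protos).shape)   # torch.Size([2, 85, 512])
```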
Background and Purpose: Gastric cancer has become the fifth most common cancer worldwide, and early detection of gastric cancer is essential to save lives. Histopathological examination of gastric cancer is the gold standard for diagnosing the disease. However, computer-aided diagnosis techniques are difficult to evaluate due to the scarcity of publicly available gastric histopathology image datasets. Methods: In this paper, a new publicly available Gastric Histopathology Sub-size Image Database (GasHisSDB) is published to evaluate the performance of classifiers. Specifically, two types of data are included, normal and abnormal, for a total of 245,196 tissue case images. To demonstrate that image classification methods from different periods show discrepancies on GasHisSDB, a variety of classifiers are selected for evaluation: seven classical machine learning classifiers, three convolutional neural network classifiers, and a novel transformer-based classifier are tested on the image classification task. Results: This study conducted extensive experiments with traditional machine learning and deep learning methods to show that methods from different periods differ on GasHisSDB. Traditional machine learning achieved a best accuracy of 86.08% and a lowest of only 41.12%. Deep learning achieved a best accuracy of 96.47% and a lowest of 86.21%. The accuracy rates of the classifiers vary significantly. Conclusion: To the best of our knowledge, this is the first publicly available gastric cancer histopathology dataset containing a large number of images for weakly supervised learning. We believe that GasHisSDB can attract researchers to explore new algorithms for the automated diagnosis of gastric cancer, which can help doctors and patients in clinical settings.
Existing deep learning methods for gastric cancer diagnosis commonly use convolutional neural networks. Recently, the Visual Transformer has attracted great attention because of its performance and efficiency, but its applications lie mostly in the field of computer vision. This paper proposes a multi-scale visual transformer model, GasHis-Transformer, for gastric histopathological image classification (GHIC), which automatically classifies microscopic gastric images into abnormal and normal cases. The GasHis-Transformer model consists of two key modules, a global information module and a local information module, to extract histopathological features effectively. In our experiments, a public hematoxylin and eosin (H&E) stained gastric histopathology dataset with 280 abnormal and normal images is divided into training, validation, and test sets at a ratio of 1:1:2. On the test set, GasHis-Transformer achieves precision, recall, F1-score, and accuracy of 98.0%, 100.0%, 96.0%, and 98.0%, respectively. Furthermore, a critical study is conducted to evaluate the robustness of GasHis-Transformer, in which ten different kinds of noise, including four adversarial attacks and six conventional image noises, are added. In addition, a clinically meaningful study is performed to test the gastrointestinal cancer identification performance of GasHis-Transformer on 620 abnormal images, reaching an accuracy of 96.8%. Finally, a comparative study is carried out to test the generalizability on H&E and immunohistochemically stained images from a lymphoma image dataset and a breast cancer dataset, yielding comparable F1-scores (85.6% and 82.8%) and accuracies (83.9% and 89.4%), respectively. In conclusion, GasHis-Transformer demonstrates high classification performance and shows significant potential in the GHIC task.
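The global/local split can be illustrated as two parallel feature branches whose outputs are fused before classification. The sketch below uses a tiny transformer over image patches as the global branch and a small CNN as the local branch; these stand-in branches, the fusion by concatenation, and the two-class head are illustrative assumptions, not the modules described in the paper.

```python
import torch
import torch.nn as nn

class GlobalLocalClassifier(nn.Module):
    """Two-branch sketch: a transformer over image patches (global context)
    and a small CNN (local texture), fused by concatenation for a 2-class output."""
    def __init__(self, img_size=64, patch=8, dim=128, num_classes=2):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.pos = nn.Parameter(torch.zeros(1, n_patches, dim))
        encoder_layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.global_branch = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.local_branch = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(2 * dim, num_classes)

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2) + self.pos  # (B, N, dim)
        g = self.global_branch(tokens).mean(dim=1)                          # global feature
        l = self.local_branch(x)                                            # local feature
        return self.head(torch.cat([g, l], dim=1))

# Toy usage on a batch of 64x64 image crops.
logits = GlobalLocalClassifier()(torch.rand(2, 3, 64, 64))
print(logits.shape)   # torch.Size([2, 2])
```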